Shubhorup Biswas comments on [missing post]

Shubhorup Biswas 19 Mar 2026 1:58 UTC
1 point
which proxies to train against.
https://www.lesswrong.com/posts/G9HdpyREaCbFJjKu5/it-is-reasonable-to-research-how-to-use-model-internals-in?commentId=krg2jzDxXhei9vNLj

and Daniel Kokotajlo's comment about preserving at least one output stream that isn’t optimised against (this could be activations, while doing CoT + output monitoring).